We can add the concept of “momentum” to vanilla gradient descent by including a scaled term corresponding to the change in the parameters from the previous iteration.
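As a rough illustration, here is a minimal pure-Python sketch of that idea. The function name `gradient_descent_momentum`, the parameter names `lr` and `beta`, their default values, and the quadratic test problem are all assumptions made for this example and are not taken from the original post. The sketch stores the previous parameter change in `velocity`, scales it by `beta`, and adds the usual gradient step.

```python
def gradient_descent_momentum(grad, theta0, lr=0.1, beta=0.9, steps=500):
    """Minimize a function from its gradient, using a momentum term.

    grad   -- callable returning the gradient as a list (assumed interface)
    theta0 -- starting parameters as a list of floats
    lr     -- learning rate (assumed name and default)
    beta   -- scale applied to the previous parameter change (assumed name)
    """
    theta = list(theta0)
    # The velocity is the change applied to the parameters on the last step.
    velocity = [0.0] * len(theta)
    for _ in range(steps):
        g = grad(theta)
        # New change = scaled previous change minus the plain gradient step.
        velocity = [beta * v - lr * gi for v, gi in zip(velocity, g)]
        theta = [t + v for t, v in zip(theta, velocity)]
    return theta


# Hypothetical test problem: f(x, y) = x**2 + 10 * y**2, minimized at (0, 0).
def quadratic_grad(theta):
    x, y = theta
    return [2.0 * x, 20.0 * y]


result = gradient_descent_momentum(quadratic_grad, [5.0, 5.0])
```

Setting `beta=0.0` drops the momentum term, so each step reduces to the vanilla gradient descent update.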
Notice that the velocity vector